
The Ultimate Generative AI Unboxing Guide: How to Explore and Fine-Tune the Generative AI Model Llama 2 on Amazon SageMaker JumpStart

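  • DIGITIMES / Taipei
  • 2024-06-24 18:11:29
Llama 2 foundation models from Meta are now available in Amazon SageMaker JumpStart
The Llama 2 foundation models developed by Meta are now available to customers through Amazon SageMaker JumpStart for fine-tuning and deployment. The Llama 2 family of large language models (LLMs) is a collection of pre-trained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The fine-tuned LLMs, called Llama-2-chat, are optimized for dialogue use cases. You can easily try out these models and use them with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms, models, and ML solutions so you can get started with ML quickly. This article walks through how to explore, deploy, and fine-tune Llama 2 models through SageMaker JumpStart.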

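What is Llama 2

Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. It is intended for commercial and research use in English. It comes in a range of parameter sizes (7 billion, 13 billion, and 70 billion) as well as pre-trained and fine-tuned variations. According to Meta, the tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. Llama 2 was pre-trained on 2 trillion tokens of data from publicly available sources. The tuned models are intended for assistant-like chat, whereas the pre-trained models can be adapted for a variety of natural language generation tasks. Regardless of which version of the model a developer uses, Meta's responsible use guide can help guide any additional fine-tuning that may be needed to customize and optimize the models with appropriate safety mitigations.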

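What is SageMaker JumpStart

With SageMaker JumpStart, ML practitioners can choose from a range of publicly available foundation models to deploy and customize. They can deploy foundation models to dedicated Amazon SageMaker instances in a network-isolated environment, and customize models using SageMaker for model training and deployment.
You can now discover and deploy Llama 2 with a few clicks in Amazon SageMaker Studio or programmatically through the SageMaker Python SDK. This lets you derive model performance and MLOps controls from SageMaker features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, or container logs. The model is deployed in an AWS secure environment and under your VPC controls, which helps ensure data security.
Llama 2 models are currently available in the following regions:
  • Fine-tunable: "us-east-1", "us-west-2", "eu-west-1"
  • Inference only: "us-west-2", "us-east-1", "us-east-2", "eu-west-1", "ap-southeast-1", "ap-southeast-2"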

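Discover models

You can access the foundation models in SageMaker JumpStart through the SageMaker Studio UI and the SageMaker Python SDK. In this section, we go over how to discover the models in SageMaker Studio.
SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all ML development steps, from preparing data to building, training, and deploying your ML models. For more details on how to get started and set up SageMaker Studio, see Amazon SageMaker Studio.
Once you are in SageMaker Studio, you can access SageMaker JumpStart, which contains pre-trained models, notebooks, and prebuilt solutions, under Prebuilt and automated solutions.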
https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/07/17/ML-15102-image001.jpg
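From the SageMaker JumpStart landing page, you can browse for solutions, models, notebooks, and other resources. You can find two flagship Llama 2 models in the Foundation Models: Text Generation carousel.
If you don't see the Llama 2 models, update your SageMaker Studio version by shutting down and restarting. For more information about version updates, see Shut Down and Update Studio Apps.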
https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/07/17/ML-15102-image003.jpg
You can also find the other four model variants by choosing Explore all Text Generation Models or by searching for llama in the search box.
https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/07/17/ML-15102-image005.jpg
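You can choose the model card to view details about the model, such as the license, the data used to train it, and how to use it. You can also find two buttons, Deploy and Open Notebook, which help you use the model.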
https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/07/18/ML-15102-image007-new.png
When you choose either button, a pop-up shows the end-user license agreement and acceptable use policy for you to acknowledge.
https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/07/17/ML-15102-image009.jpg
After you acknowledge them, you proceed to the next step to use the model.

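Deploy a model

When you choose Deploy and acknowledge the terms, model deployment starts. Alternatively, you can deploy through the example notebook that appears when you choose Open Notebook. The example notebook provides end-to-end guidance on how to deploy the model for inference and clean up resources.
To deploy using a notebook, we start by selecting an appropriate model, specified by the model_id. You can deploy any of the selected models on SageMaker with the following code: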
from sagemaker.jumpstart.model import JumpStartModel

# Select the model by its JumpStart model_id, then deploy it to an endpoint
my_model = JumpStartModel(model_id="meta-textgeneration-llama-2-70b-f")
predictor = my_model.deploy()
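This deploys the model on SageMaker with default configurations, including the default instance type and default VPC configurations. You can change these configurations by specifying non-default values in JumpStartModel. After it is deployed, you can run inference against the deployed endpoint through the SageMaker predictor: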
payload = {
    "inputs": [
        [
            {"role": "system", "content": "Always answer with Haiku"},
            {"role": "user", "content": "I am going to Paris, what should I see?"},
        ]
    ],
    "parameters": {"max_new_tokens": 256, "top_p": 0.9, "temperature": 0.6},
}
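The fine-tuned chat models (Llama-2-7b-chat, Llama-2-13b-chat, Llama-2-70b-chat) accept a history of chat between the user and the chat assistant, and generate the subsequent chat. The pre-trained models (Llama-2-7b, Llama-2-13b, Llama-2-70b) require a string prompt and perform text completion on the provided prompt. See the following code: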
predictor.predict(payload, custom_attributes="accept_eula=true")
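Note that accept_eula is set to false by default. You need to set accept_eula=true to invoke the endpoint successfully. By doing so, you accept the user license agreement and acceptable use policy mentioned earlier. You can also download the license agreement.
The custom_attributes used to pass the EULA are key/value pairs. The key and value are separated by = and pairs are separated by ;. If the user passes the same key more than once, the last value is kept and passed to the script handler (in this case, it is used for conditional logic). For example, if accept_eula=false; accept_eula=true is passed to the server, then accept_eula=true is kept and passed to the script handler.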
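The "last value wins" rule described above can be illustrated with a small client-side sketch. This is not the endpoint's actual handler code, which the article does not show; parse_custom_attributes is a hypothetical helper written only to mirror the stated behavior.

```python
def parse_custom_attributes(raw: str) -> dict:
    """Parse 'key=value; key=value' pairs; the last value for a repeated key wins.

    Hypothetical helper that mirrors the rule described in the text.
    """
    attributes = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing ';'
        key, separator, value = pair.partition("=")
        if not separator:
            raise ValueError(f"Expected key=value, got {pair!r}")
        # A later duplicate simply overwrites the earlier entry
        attributes[key.strip()] = value.strip()
    return attributes


print(parse_custom_attributes("accept_eula=false; accept_eula=true"))
# prints {'accept_eula': 'true'}
```

Because a dictionary keeps only one value per key, assigning in order of appearance reproduces the behavior from the example: accept_eula=true survives.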
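Inference parameters control the text generation process at the endpoint. Max new tokens controls the size of the output generated by the model. Note that this is not the same as the number of words, because the model's vocabulary is not the same as the English vocabulary and a token may not be an English word. Temperature controls the randomness of the output. A higher temperature results in more creative and more hallucinated outputs. All inference parameters are optional.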
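To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax, the general technique commonly used when sampling from language models. The article does not describe how the endpoint implements sampling, so treat this as an illustration only; the logits are made-up scores for three candidate tokens.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by the temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for temperature in (0.6, 1.0, 1.5):
    probabilities = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

A lower temperature concentrates probability on the top-scoring token, and a higher temperature flattens the distribution, which is why sampled outputs become more varied.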
The following table lists all the Llama models available in SageMaker JumpStart, along with the model_id, default instance type, and maximum number of total tokens (the sum of the number of input tokens and the number of generated tokens) supported for each model.

Model name         Model ID                            Max total tokens   Default instance type
Llama-2-7b         meta-textgeneration-llama-2-7b      4096               ml.g5.2xlarge
Llama-2-7b-chat    meta-textgeneration-llama-2-7b-f    4096               ml.g5.2xlarge
Llama-2-13b        meta-textgeneration-llama-2-13b     4096               ml.g5.12xlarge
Llama-2-13b-chat   meta-textgeneration-llama-2-13b-f   4096               ml.g5.12xlarge
Llama-2-70b        meta-textgeneration-llama-2-70b     4096               ml.g5.48xlarge
Llama-2-70b-chat   meta-textgeneration-llama-2-70b-f   4096
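The table can be captured as a small lookup so that a notebook selects a model by name. This is a sketch: LLAMA2_MODELS, is_chat_model, and deploy_llama2 are hypothetical names, and the 70b-chat instance type stays None because the table above does not list it. The deploy function imports the SageMaker SDK lazily and needs AWS credentials, so it is only defined here, not called.

```python
# Model name -> (model_id, default instance type), copied from the table above
LLAMA2_MODELS = {
    "Llama-2-7b": ("meta-textgeneration-llama-2-7b", "ml.g5.2xlarge"),
    "Llama-2-7b-chat": ("meta-textgeneration-llama-2-7b-f", "ml.g5.2xlarge"),
    "Llama-2-13b": ("meta-textgeneration-llama-2-13b", "ml.g5.12xlarge"),
    "Llama-2-13b-chat": ("meta-textgeneration-llama-2-13b-f", "ml.g5.12xlarge"),
    "Llama-2-70b": ("meta-textgeneration-llama-2-70b", "ml.g5.48xlarge"),
    # The table gives no instance type for this row, so none is assumed here
    "Llama-2-70b-chat": ("meta-textgeneration-llama-2-70b-f", None),
}


def is_chat_model(model_name):
    """Chat variants use model IDs ending in '-f' in the table above."""
    model_id, _ = LLAMA2_MODELS[model_name]
    return model_id.endswith("-f")


def deploy_llama2(model_name, instance_type=None):
    """Deploy by name; pass instance_type only to override the JumpStart default."""
    from sagemaker.jumpstart.model import JumpStartModel  # requires the SageMaker SDK

    model_id, _ = LLAMA2_MODELS[model_name]
    kwargs = {"model_id": model_id}
    if instance_type is not None:
        kwargs["instance_type"] = instance_type
    return JumpStartModel(**kwargs).deploy()


print(is_chat_model("Llama-2-13b-chat"))
```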
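Note that SageMaker endpoints have a timeout limit of 60 seconds. Therefore, even though the model may be able to generate 4096 tokens, the request fails if text generation takes more than 60 seconds. For the 7B, 13B, and 70B models, we recommend setting max_new_tokens no greater than 1500, 1000, and 500, respectively, while keeping the total number of tokens under 4K.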
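One way to apply these recommendations is to clamp the requested value before sending a request. The sketch below assumes you already know the prompt's token count (the article does not cover tokenization); safe_max_new_tokens and its arguments are hypothetical names.

```python
MAX_TOTAL_TOKENS = 4096  # input tokens + generated tokens, per the model table
RECOMMENDED_MAX_NEW_TOKENS = {"7b": 1500, "13b": 1000, "70b": 500}


def safe_max_new_tokens(model_size, prompt_tokens, requested):
    """Clamp a requested max_new_tokens to the recommended cap and the token budget."""
    if model_size not in RECOMMENDED_MAX_NEW_TOKENS:
        raise ValueError(f"Unknown model size: {model_size!r}")
    room = MAX_TOTAL_TOKENS - prompt_tokens  # tokens left after the prompt
    if room <= 0:
        raise ValueError("Prompt already uses the whole token budget")
    return min(requested, RECOMMENDED_MAX_NEW_TOKENS[model_size], room)


print(safe_max_new_tokens("70b", prompt_tokens=200, requested=2048))  # prints 500
```

The cap only lowers the risk of hitting the 60-second timeout; actual generation time still depends on the instance and the request.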

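Inference and example prompts for Llama-2-70b

You can use Llama models for text completion on any piece of text. Through text generation, you can perform a variety of tasks, such as answering questions, language translation, sentiment analysis, and more. The input payload to the endpoint looks like the following:
payload = {
    "inputs": str,
    (optional) "parameters": {"max_new_tokens": int, "top_p": float, "temperature": float}
}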
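The schema above can be turned into a concrete payload with a small builder. This is a sketch with a hypothetical helper name; it adds the "parameters" key only when at least one parameter is set, since all inference parameters are optional.

```python
def build_completion_payload(prompt, max_new_tokens=None, top_p=None, temperature=None):
    """Build the request body for a pre-trained (text completion) Llama 2 endpoint."""
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")
    payload = {"inputs": prompt}
    candidates = {
        "max_new_tokens": max_new_tokens,
        "top_p": top_p,
        "temperature": temperature,
    }
    parameters = {name: value for name, value in candidates.items() if value is not None}
    if parameters:
        payload["parameters"] = parameters
    return payload


payload = build_completion_payload(
    "I believe the meaning of life is",
    max_new_tokens=256,
    top_p=0.9,
    temperature=0.6,
)
print(payload)
# With a deployed endpoint, the call from the article would then be:
# predictor.predict(payload, custom_attributes="accept_eula=true")
```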
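The following are some example prompts and the text generated by the model. All outputs were generated with the following inference parameters:
{"max_new_tokens":256, "top_p":0.9, "temperature":0.6}
Input:
Simply put, the theory of relativity states that
Output: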
1) the laws of physics are the same for all observers in uniform motion relative to one another, and 2) the speed of light is the same for all observers, regardless of their relative motion or of the motion of the source of the light.
In the late 1800s, scientists believed that light travelled through a medium they called the luminiferous ether. But in 1887, Albert Michelson and Edward Morley designed an experiment to measure the speed of light in the ether, and found that the speed of light was the same no matter how they oriented their equipment. This result seemed to contradict the idea of the ether, and Michelson and Morley’s experiment became one of the most famous failed experiments in history.
In 1905, Albert Einstein published a paper that used the results of the Michelson-Morley experiment to develop the theory of special relativity. In 1915, Einstein published a paper on general relativity, which is a theory of gravitation.
In physics, the speed of light is considered to be a constant, and its value is 299,792,458 metres
Input:
I believe the meaning of life is
Output:
to be happy, to find joy in life, to love and be loved. To have the courage to be yourself and to be kind to others. To be able to express yourself, to create and to be inspired. To be able to give and receive. To be able to learn and grow. To be able to find beauty in the world and to be able to appreciate it. To be able to find peace and contentment. To be able to find meaning and purpose. To be able to make a difference in the world. To be able to find joy in the simple things. To be able to find joy in the difficult things. To be able to find joy in the journey. To be able to find joy in the destination. To be able to find joy in the present moment. To be able to find joy in the future. To be able to find joy in the past. To be able to find joy in the present. To be able to find joy in the future. To be able to find joy in the past. To be able to find joy in the present. To be able to find joy in the future. To be able to find joy in the past. To be able to find joy in the present. To be able to find joy
Input:
A brief message congratulating the team on the launch:

Hi everyone,

I just
Output:
wanted to send a quick note to congratulate everyone on the launch of our new website. I know it’s been a long time coming and I’m sure we’re all relieved that it’s finally out there.

Thanks to everyone for their hard work and dedication over the last few months. I know it wasn’t always easy, but I think we can all be proud of the final product.

I’m looking forward to seeing how our users respond to the new site and I’m confident that it will be a success.

Thanks again, and enjoy the rest of your day!

Best,

[Your Name]

[Your Title]

[Your Company]

[Your Email]

[Your Phone Number]

[Your Website]

[Your Social Media Handles]

[Your Address]

[Your City, State, Zip Code]

[Your Country]

[Your Postal Code]

[Your Website]

[
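In the next example, we show how to use Llama models with few-shot in-context learning, where we provide the model with a few training samples. Note that we only run inference on the deployed model; the model weights do not change during this process.
Input: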
Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
plush girafe => girafe peluche
cheese =>
Output:
fromage
chocolate => chocolat
car => voiture
hamster => hamster
beer => bière
bicycle => bicyclette
banana => banane
strawberry => fraise
pizza => pizza
cheeseburger => cheeseburger
pineapple => ananas
carrot => carotte
watermelon => pastèque
whale => baleine
eggplant => aubergine
cat => chat
penguin => manchot
lion => lion
elephant => éléphant
panda => panda
tiger => tigre
salmon => saumon
squirrel => écureuil
kangaroo => kangourou
turtle => tortue
octopus => poulpe
frog => grenouille
pig => cochon
goldfish => poisson rouge
tiger shark => requin tigre
gorilla => gorille
dol
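
The few-shot prompt in this example follows a simple pattern: an instruction line, a few "source => target" pairs, and a final line left open for the model to complete. A minimal sketch of building such a prompt, and of keeping only the first generated line (the model keeps inventing further pairs, as the output above shows), might look like this. Both helper names are hypothetical.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, 'source => target' examples, and an open final line."""
    lines = [instruction]
    lines.extend(f"{source} => {target}" for source, target in examples)
    lines.append(f"{query} =>")
    return "\n".join(lines)


def extract_first_answer(generated_text):
    """Keep only the first line of the completion, dropping any continued pairs."""
    return generated_text.strip().splitlines()[0].strip()


prompt = build_few_shot_prompt(
    "Translate English to French:",
    [
        ("sea otter", "loutre de mer"),
        ("peppermint", "menthe poivrée"),
        ("plush girafe", "girafe peluche"),
    ],
    "cheese",
)
print(prompt)
print(extract_first_answer(" fromage\nchocolate => chocolat\ncar => voiture"))  # prints fromage
```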

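Inference and example prompts for Llama-2-70b-chat

For the Llama-2-Chat models, which are optimized for dialogue use cases, the input to the chat model endpoint is the previous history between the chat assistant and the user. You can ask questions that are contextual to the conversation so far. You can also provide a system configuration, such as a persona that defines the chat assistant's behavior. The input payload to the endpoint looks like the following: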
payload = {
    "inputs": [
        [
            (optional) {"role": "system", "content": str},
            (repeated k times) {"role": "user", "content": str}, {"role": "assistant", "content": str},
            {"role": "user", "content": str},
        ]
    ],
    (optional) "parameters": {"max_new_tokens": int, "top_p": float, "temperature": float}
}
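The chat schema above can be assembled and checked on the client side before calling the endpoint. The following sketch uses a hypothetical helper, build_chat_payload, that enforces the structure shown: an optional system message, alternating user and assistant turns, and a final user turn.

```python
def build_chat_payload(turns, system_prompt=None, parameters=None):
    """Build a chat payload from (role, content) turns.

    Turns must alternate user/assistant, starting and ending with a user turn.
    """
    if not turns:
        raise ValueError("At least one user turn is required")
    dialog = []
    if system_prompt is not None:
        dialog.append({"role": "system", "content": system_prompt})
    for index, (role, content) in enumerate(turns):
        expected_role = "user" if index % 2 == 0 else "assistant"
        if role != expected_role:
            raise ValueError(f"Turn {index} should be {expected_role!r}, got {role!r}")
        dialog.append({"role": role, "content": content})
    if dialog[-1]["role"] != "user":
        raise ValueError("The conversation must end with a user turn")
    payload = {"inputs": [dialog]}  # one dialog wrapped in the outer list
    if parameters:
        payload["parameters"] = dict(parameters)
    return payload


chat_payload = build_chat_payload(
    [("user", "I am going to Paris, what should I see?")],
    system_prompt="Always answer with Haiku",
    parameters={"max_new_tokens": 512, "top_p": 0.9, "temperature": 0.6},
)
print(chat_payload)
```

Validating the turn order locally gives a clear error message before any request is sent; how the endpoint itself reacts to a malformed dialog is not covered in the article.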
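The following are some example prompts and the text generated by the model. All outputs were generated with the inference parameters {"max_new_tokens": 512, "top_p": 0.9, "temperature": 0.6}.
Input:
User: what is the recipe of mayonnaise?
Output: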
Assistant: Sure, I can help you with that! Mayonnaise is a thick, creamy condiment made from a mixture of egg yolks, oil, vinegar or lemon juice, and seasonings. Here's a basic recipe for homemade mayonnaise:

Ingredients:

* 2 egg yolks
* 1/2 cup (120 ml) neutral-tasting oil, such as canola or grapeseed
* 1 tablespoon (15 ml) vinegar or lemon juice
* 1/2 teaspoon salt
* 1/4 teaspoon black pepper

Instructions:

1. In a medium-sized bowl, whisk together the egg yolks, salt, and black pepper until well combined.
2. Slowly pour the oil into the bowl while continuously whisking the mixture. It's important to add the oil very slowly, otherwise the mixture can break and you'll end up with a separated, curdled mess.
3. Continue whisking until the mixture thickens and emulsifies, which should take about 5-7 minutes. You'll know it's ready when the mixture has doubled in volume and has a thick, creamy consistency.
4. Add the vinegar or lemon juice and continue whisking for another minute or two, until the flavors are fully incorporated.
5. Taste the mayonnaise and adjust the seasoning as needed. You can add more salt, pepper, or acid if desired.
6. Cover the bowl with plastic wrap and refrigerate the mayonnaise for at least 30 minutes before serving. This will allow the flavors to meld together and the mayonnaise to thicken further.

That's it! This recipe makes about 1 1/2 cups (360 ml) of mayonnaise, which should be enough for a small to medium-sized batch of sandwiches or salads. You can also customize the recipe by adding different flavorings, such as garlic, mustard, or herbs, to create a unique taste. Enjoy!
In the following example, the user has already had a conversation with the assistant about tourist sites in Paris. Next, the user asks about the first option the assistant recommended.
Input:
User: I am going to Paris, what should I see?
Assistant: Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.
User: What is so great about #1?
Output:
Assistant: The Eiffel Tower is a world-renowned iconic landmark that symbolizes Paris and France. It was built for the World's Fair in 1889 and was initially intended to be a temporary structure, but it has become a permanent part of the Parisian skyline. Here are some reasons why the Eiffel Tower is so great:

1. Engineering marvel: The Eiffel Tower was a groundbreaking feat of engineering at the time of its construction. It stands at 324 meters (1,063 feet) tall and was the tallest structure in the world for over 40 years.
2. Iconic design: The Eiffel Tower's distinctive shape and design make it instantly recognizable. It has been the subject of countless works of art, photographs, and films, and has become a symbol of Paris and France.
3. Panoramic views: The Eiffel Tower offers breathtaking views of the city from its observation decks. Visitors can enjoy a 360-degree view of Paris, including famous landmarks such as the Seine River, the Arc de Triomphe, and the Montmartre neighborhood.
4. Historical significance: The Eiffel Tower has played a significant role in Paris's history, having been the site of numerous important events, including the famous "Battle of the Eiffel Tower" during World War II.
5. Cultural impact: The Eiffel Tower has had a significant impact on popular culture, inspiring countless works of art, literature, and music. It has also been the subject of numerous documentaries and films, including the classic movie "An American in Paris."
6. Romantic setting: The Eiffel Tower is known for its romantic atmosphere, making it a popular destination for couples and newlyweds. It has been the site of countless marriage proposals and has been featured in numerous romantic films and literature.
7. Accessibility: The Eiffel Tower is easily accessible by public transportation, making it a convenient destination for visitors. It is also wheelchair-friendly, with elevators and ramps providing access to all levels.

Overall, the Eiffel Tower is a must-see destination for anyone visiting Paris, offering a unique and unforgettable experience that combines history, culture, engineering, and rom
In the following examples, we set the system configuration.
Input:
System: Always answer with Haiku
User: I am going to Paris, what should I see?
Output:
Assistant: Eiffel Tower high
Art in the Louvre Museum
River Seine flows
Input:
System: Always answer with emojis
User: How to go from Beijing to NY?
Output:
Assistant: Here are the emojis that represent the steps to go from Beijing to New York:

🇨🇳🛫🛬🇺🇸🗽🏙️🚕💨🛩️🛬🛫😴💤🛩️🛬🛫😍

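Fine-tuning

In machine learning, the ability to transfer the knowledge learned in one domain to another domain is called transfer learning. You can use transfer learning to produce accurate models on smaller datasets, at a much lower training cost than training the original model. Using transfer learning, you can fine-tune a Llama 2 model on your own dataset in 1 to 2 hours. You can fine-tune the base model with domain adaptation datasets and with instruction tuning datasets. Currently, you can train the Llama 2 7B and 13B models on SageMaker JumpStart. The fine-tuning scripts are based on scripts provided by an external source. They use the Fully Sharded Data Parallel (FSDP) library together with the Low-Rank Adaptation (LoRA) method to fine-tune the models efficiently. You can adjust any of the 14 hyperparameters to tailor fine-tuning to your application. To learn more, see the Fine-tune LLaMA 2 models on SageMaker JumpStart notebook and the Fine-tune Llama 2 for text generation on Amazon SageMaker JumpStart blog post.
To train a model through Studio, choose the Train button on the model card. It trains the model on SageMaker with the default dataset and default parameters.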
https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/08/11/llama-fine-tuning.jpg
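Fine-tuning can also be started from the SageMaker Python SDK instead of the Train button. The article does not show this code, so the following is only a sketch under assumptions: it uses the SDK's JumpStartEstimator class, the hyperparameter names in the usage line ("epoch", "instruction_tuned") are illustrative and should be checked against the fine-tuning notebook, and the S3 URI argument is a placeholder you supply. The launch function needs AWS credentials and starts a billable training job, so it is defined but not called.

```python
def stringify_hyperparameters(overrides):
    """Convert hyperparameter values to strings before handing them to the estimator."""
    return {name: str(value) for name, value in overrides.items()}


def launch_fine_tuning(model_id, training_data_s3_uri, overrides):
    """Sketch of a programmatic fine-tuning run; not executed in this example."""
    from sagemaker.jumpstart.estimator import JumpStartEstimator  # requires the SageMaker SDK

    estimator = JumpStartEstimator(
        model_id=model_id,
        environment={"accept_eula": "true"},  # accept the license, as with inference
    )
    estimator.set_hyperparameters(**stringify_hyperparameters(overrides))
    estimator.fit({"training": training_data_s3_uri})
    return estimator


# Hypothetical overrides; the real names are among the 14 hyperparameters in the notebook
print(stringify_hyperparameters({"epoch": 3, "instruction_tuned": True}))
```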

Clean up

After you finish running the notebook, make sure to delete all the resources you created in the process so that billing stops:
predictor.delete_model()
predictor.delete_endpoint()
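If the first call fails, the endpoint could be left running and continue to incur charges. A small wrapper, shown here as a sketch with a hypothetical name, always attempts the endpoint deletion. FakePredictor stands in for a real predictor so the example runs without AWS access.

```python
def clean_up(predictor):
    """Delete the model, then the endpoint; the endpoint deletion is always attempted."""
    try:
        predictor.delete_model()
    finally:
        predictor.delete_endpoint()


class FakePredictor:
    """Stand-in for a SageMaker predictor that records which methods were called."""

    def __init__(self):
        self.calls = []

    def delete_model(self):
        self.calls.append("delete_model")

    def delete_endpoint(self):
        self.calls.append("delete_endpoint")


fake = FakePredictor()
clean_up(fake)
print(fake.calls)  # prints ['delete_model', 'delete_endpoint']
```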

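Conclusion

In this article, we showed you how to get started with Llama 2 models in SageMaker Studio. With this capability, you have access to six Llama 2 foundation models that contain billions of parameters. Because foundation models are pre-trained, they can also help lower training and infrastructure costs and enable customization for your use case. To get started, see the official SageMaker JumpStart resources.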